Learning mutational graphs of individual tumour evolution from single-cell and multi-region sequencing data
Background. A large number of algorithms is being developed to reconstruct
evolutionary models of individual tumours from genome sequencing data. Most
methods can analyze multiple samples collected either through bulk multi-region
sequencing experiments or the sequencing of individual cancer cells. However,
the same method can rarely support both data types.
Results. We introduce TRaIT, a computational framework to infer mutational
graphs that model the accumulation of multiple types of somatic alterations
driving tumour evolution. Compared to other tools, TRaIT supports multi-region
and single-cell sequencing data within the same statistical framework, and
delivers expressive models that capture many complex evolutionary phenomena.
TRaIT improves accuracy, robustness to data-specific errors and computational
complexity compared to competing methods.
Conclusions. We show that the application of TRaIT to single-cell and
multi-region cancer datasets can produce accurate and reliable models of
single-tumour evolution, quantify the extent of intra-tumour heterogeneity and
generate new testable experimental hypotheses.
Additional file 2: Figure S1. of Potential and active functions in the gut microbiota of a healthy human cohort
Principal component analysis plots related to taxonomic and functional features. MG data are in blue, while MP data are in red. Each dot (with different shape) represents a different human subject. (A) phyla; (B) genera; (C) KOGs; (D) KOG-phylum combinations. (PNG 2001 kb)
Additional file 5: Dataset S2. of Potential and active functions in the gut microbiota of a healthy human cohort
Relative abundance and differential analysis outputs concerning Firmicutes and Bacteroidetes KOGs, according to MG and MP data. (XLSX 101 kb)
Percentage of missing non-reference genotypes (i.e. false negatives) per individual in families for variants called by joint modeling family data and the standard approach of ignoring relatedness for sequencing coverage between 5× and 30× and for input sequence data with Phred-scaled quality of 20 (error rate of 1% per base) or 30 (error rate of 0.1% per base) without mapping error.
<p>For all scenarios 300 sequenced individuals were simulated.</p>
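The correspondence between Phred-scaled quality and per-base error rate quoted in the caption above (Q20 = 1%, Q30 = 0.1%) follows the standard Phred definition, Q = -10 log10(P). A minimal Python sketch of that conversion is given here for orientation; the function names are illustrative and are not taken from PolyMutt or GATK.

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Per-base error probability for a Phred-scaled quality score Q."""
    return 10 ** (-q / 10)


def error_rate_to_phred(p: float) -> float:
    """Phred-scaled quality score for a per-base error probability P."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # The two base qualities used in the simulations described above.
    for q in (20, 30):
        print(f"Q{q}: error rate {phred_to_error_rate(q):.4f} per base")
```

Each increase of 10 in Q corresponds to a tenfold drop in the assumed error rate, which is why the Q30 scenarios use an error rate one tenth of the Q20 scenarios.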
Mismatch rates (%) of 4 categories of genotypes by the reference allele frequencies for pedigrees of quartet (two siblings and their parents) with base quality Q20 at 15× without mapping error.
<p>The 4 categories are (A) overall genotypes, (B) homozygous alternative allele, (C) heterozygotes and (D) homozygous reference allele.</p>
The receiver operating characteristic (ROC) curves of PolyMutt and the standard methods for <i>de novo</i> mutation (DNM) detection from empirically calibrated alignments of simulated reads with sequencing coverage of 30× with base quality of Q20.
<p>PolyMutt (ignoring relatedness) and GATK calls were obtained by jointly calling a trio assuming individuals in a trio are unrelated using Polymutt and GATK respectively.</p>
Number of false positive <i>de novo</i> mutations per billion bases detected by PolyMutt with joint modeling, for sequencing at coverage 5×–40× with Phred-scaled base quality Q20 (1% error rate) without mapping error, in different pedigree structures.
Heterozygous mismatch rates (%) and Mendelian inconsistency rates (%) per site of call sets generated by PolyMutt (family-aware) and the standard approaches using PolyMutt (ignoring relatedness) and GATK from empirically calibrated alignments of simulated reads with base quality of Q20 in the pedigree shown in Figure 1.
<p>Heterozygous mismatch rates (%) and Mendelian inconsistency rates (%) per site of call sets generated by PolyMutt (family-aware) and the standard approaches using PolyMutt (ignoring relatedness) and GATK from empirically calibrated alignments of simulated reads with base quality of Q20 in the pedigree shown in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002944#pgen-1002944-g001" target="_blank">Figure 1</a>.</p>
Genotype mismatch rates (%) for different family structures with sequencing coverage of 5×, 15×, and 30× and input bases with Phred-scaled quality Q20 (1% error rate) or Q30 (0.1% error rate) without mapping error.
<p>The mismatch rates are shown for 4 genotype categories: all genotypes (All), homozygous reference allele (HomRef), heterozygotes (Het), and homozygous alternative allele (HomAlt).</p>
Three-generation extended pedigrees.
<p>Panel A) is a 3-generation extended pedigree with numbers labeling the individual heterozygous genotype mismatch rates (%) at coverage of 15× with base quality of Q20 without mapping error, and Panel B) labels the corresponding mismatch rates for the standard approach of ignoring relatedness. Panels C) and D) display the heterozygous mismatch rates (%) when a fixed sequencing effort of 150× is allocated differently to family members: Panel C) is for the situation where the founders are allocated 30× while non-founders have 5×, and in Panel D) founders and non-founders have coverage of 6× and 21× respectively.</p>
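The two allocations in Panels C) and D) can be checked against the stated 150× budget with a little arithmetic. The sketch below rests on assumptions the caption does not state: that 150× is the sum of per-individual coverages, and that all founders (and all non-founders) receive the same coverage within a panel. Under those assumptions the two plans form a 2×2 linear system whose solution gives the implied number of founders and non-founders; the function name is hypothetical.

```python
from fractions import Fraction


def implied_family_counts(total, plan_a, plan_b):
    """Solve f*a1 + n*a2 = total and f*b1 + n*b2 = total for (f, n).

    Each plan is a (founder_coverage, non_founder_coverage) pair.
    Cramer's rule with exact fractions avoids rounding artefacts.
    """
    a1, a2 = plan_a
    b1, b2 = plan_b
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the two plans do not determine the counts")
    founders = Fraction(total * (b2 - a2), det)
    non_founders = Fraction(total * (a1 - b1), det)
    return founders, non_founders


if __name__ == "__main__":
    # Panel C: founders 30x, non-founders 5x; Panel D: 6x and 21x.
    f, n = implied_family_counts(150, (30, 5), (6, 21))
    print(f"implied founders: {f}, implied non-founders: {n}")
```

If the solution is a pair of whole numbers, both plans spend exactly the same total; a non-integer result would indicate that one of the assumptions above does not hold for this pedigree.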